Open Vocabulary Object Detection with Proposal Mining and Prediction Equalization
Open-vocabulary object detection (OVD) aims to scale up vocabulary size to
detect objects of novel categories beyond the training vocabulary. Recent work
draws on the rich knowledge of pre-trained vision-language models. However,
existing methods remain ineffective at proposal-level vision-language alignment.
Meanwhile, such models usually suffer from a confidence bias toward base
categories and perform worse on novel ones. To overcome these challenges, we
present MEDet, a novel and effective OVD framework with proposal mining and
prediction equalization. First, we design an online proposal mining scheme to refine
the inherited vision-semantic knowledge from coarse to fine, allowing for
proposal-level detection-oriented feature alignment. Second, based on causal
inference theory, we introduce a class-wise backdoor adjustment to reinforce
the predictions on novel categories to improve the overall OVD performance.
Extensive experiments on COCO and LVIS benchmarks verify the superiority of
MEDet over the competing approaches in detecting objects of novel categories,
e.g., 32.6% AP50 on COCO and 22.4% mask mAP on LVIS.
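The class-wise backdoor adjustment is only named in the abstract, so the following is a minimal sketch of the general idea rather than MEDet's actual implementation: treat the empirical class frequency as a confounder prior and discount class scores by it, so that predictions on under-represented novel categories are reinforced relative to base ones. The function name, the hyperparameter `alpha`, and the exact form of the adjustment are all assumptions.

```python
import numpy as np

def backdoor_adjusted_scores(sim, class_prior, alpha=0.5):
    """Hypothetical sketch of a class-wise backdoor adjustment.

    sim: (N, C) proposal-to-class similarity scores.
    class_prior: (C,) empirical class frequency in training data,
        standing in for the confounder distribution P(c).
    alpha: assumed hyperparameter controlling adjustment strength.

    Idea: raw scores are biased toward frequent base classes, so we
    down-weight each class by its prior, moving the prediction from
    P(y | x) toward a de-confounded P(y | do(x)).
    """
    probs = np.exp(sim) / np.exp(sim).sum(axis=1, keepdims=True)
    adjusted = np.log(probs + 1e-12) - alpha * np.log(class_prior + 1e-12)
    # subtract the row-wise max for numerical stability; ranking is unchanged
    return adjusted - adjusted.max(axis=1, keepdims=True)
```

With a strongly imbalanced prior, the adjustment can flip a near-tie in favor of the rare (novel) class, which is the intended equalization effect.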
Revisiting Image Aesthetic Assessment via Self-Supervised Feature Learning
Visual aesthetic assessment has been an active research field for decades.
Although latest methods have achieved promising performance on benchmark
datasets, they typically rely on a large number of manual annotations including
both aesthetic labels and related image attributes. In this paper, we revisit
the problem of image aesthetic assessment from the self-supervised feature
learning perspective. Our motivation is that a suitable feature representation
for image aesthetic assessment should be able to distinguish different
expert-designed image manipulations, which have close relationships with
negative aesthetic effects. To this end, we design two novel pretext tasks to
identify the types and parameters of editing operations applied to synthetic
instances. The features from our pretext tasks are then adapted for a one-layer
linear classifier to evaluate the performance in terms of binary aesthetic
classification. We conduct extensive quantitative experiments on three
benchmark datasets and demonstrate that our approach can faithfully extract
aesthetics-aware features and outperform alternative pretext schemes. Moreover,
we achieve comparable results to state-of-the-art supervised methods that use
10 million labels from ImageNet.
Comment: accepted to the AAAI Conference on Artificial Intelligence, 2020.
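The abstract describes pretext tasks that identify the type and parameter of an editing operation applied to a synthetic instance. A minimal sketch of how such pretext labels might be generated follows; the operation set, the parameter binning, and all names are assumptions for illustration, not the paper's actual design.

```python
import numpy as np

# Assumed set of aesthetics-degrading edit operations.
OPS = ["brightness", "contrast", "noise"]

def make_pretext_sample(img, rng):
    """Degrade a clean image (float array in [0, 1]) with one random
    operation and return the result plus the pretext targets:
    the operation type and a discretized strength parameter."""
    op_id = int(rng.integers(len(OPS)))
    level = int(rng.integers(3))       # coarse parameter bin: 0, 1, 2
    strength = 0.2 * (level + 1)
    if OPS[op_id] == "brightness":
        out = np.clip(img + strength, 0.0, 1.0)
    elif OPS[op_id] == "contrast":
        out = np.clip((img - 0.5) * (1.0 + strength) + 0.5, 0.0, 1.0)
    else:  # additive Gaussian noise
        out = np.clip(img + rng.normal(0.0, strength, img.shape), 0.0, 1.0)
    # a network trained to predict (op_id, level) from `out` learns
    # features sensitive to aesthetics-degrading manipulations
    return out, op_id, level
```

The features learned this way are then frozen and evaluated with a one-layer linear classifier, as the abstract describes.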
Centroid-Aware Local Discriminative Metric Learning in Speaker Verification
We propose a new mechanism to pave the way for efficient learning against class imbalance and to improve the representation of identity vectors (i-vectors) in automatic speaker verification (ASV). The insight is to effectively exploit the inherent structure within the ASV corpus — centroid priors. In particular: (1) to ensure learning efficiency against class imbalance, centroid-aware balanced boosting sampling is proposed to collect balanced mini-batches; (2) to strengthen local discriminative modeling on the mini-batches, neighborhood component analysis (NCA) and magnet loss (MNL) are adopted with ASV-specific modifications. The integration yields adaptive NCA (AdaNCA) and linear MNL (LMNL). Numerical results show that LMNL is a competitive candidate for low-dimensional projection of i-vectors (EER = 3.84% on SRE2008, EER = 1.81% on SRE2010), enjoying a competitive edge over linear discriminant analysis (LDA). AdaNCA (EER = 4.03% on SRE2008, EER = 2.05% on SRE2010) also performs well. Furthermore, to facilitate future study of boosting sampling, connections between boosting sampling, hinge loss, and data augmentation have been established, which help further explain the behavior of boosting sampling.
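The centroid-aware balanced boosting sampling is described only at a high level. The sketch below shows the plain balanced-sampling core — equal utterances per selected speaker in each mini-batch — which is what counters class imbalance; the centroid-aware boosting weights are omitted, and all names are assumed.

```python
import numpy as np
from collections import defaultdict

def balanced_minibatch(labels, classes_per_batch, samples_per_class, rng):
    """Collect a class-balanced mini-batch of dataset indices.

    labels: sequence of per-utterance speaker labels.
    Returns a list of indices with exactly `samples_per_class` items
    for each of `classes_per_batch` randomly chosen speakers.
    """
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    # only speakers with enough utterances are eligible
    eligible = [c for c, idxs in by_class.items()
                if len(idxs) >= samples_per_class]
    chosen = rng.choice(eligible, size=classes_per_batch, replace=False)
    batch = []
    for c in chosen:
        batch.extend(rng.choice(by_class[c], size=samples_per_class,
                                replace=False))
    return batch
```

Each batch is balanced by construction, so losses such as NCA or magnet loss see every selected speaker equally often regardless of corpus-level imbalance.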
Generalizable Representation Learning for Mixture Domain Face Anti-Spoofing
Face anti-spoofing based on domain generalization (DG) has drawn growing attention due to its robustness to unseen scenarios. Existing DG methods assume that the domain label is known. However, in real-world applications, the collected dataset often contains a mixture of domains, where the domain label is unknown, and most existing methods fail in this setting. Further, even when the domain labels assumed by existing methods are available, we argue that such a partition is merely sub-optimal. To overcome this limitation, we propose domain dynamic adjustment meta-learning (DAM), which requires no domain labels: it iteratively divides mixture domains via discriminative domain representations and trains a generalizable face anti-spoofing model with meta-learning. Specifically, we design a domain feature based on Instance Normalization (IN) and propose a domain representation learning module (DRLM) to extract discriminative domain features for clustering. Moreover, to reduce the side effect of outliers on clustering performance, we additionally utilize maximum mean discrepancy (MMD) to align the distribution of sample features with a prior distribution, which improves the reliability of clustering. Extensive experiments show that the proposed method outperforms conventional DG-based face anti-spoofing methods, including those utilizing domain labels. Furthermore, we enhance interpretability through visualization.
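The MMD term above measures how far the sample-feature distribution is from the prior it is aligned with. A minimal sketch of the standard (biased) MMD estimator with an RBF kernel, assuming `sigma` as a free bandwidth parameter (the paper's exact kernel and estimator are not specified in the abstract):

```python
import numpy as np

def mmd_rbf(x, y, sigma=1.0):
    """Biased MMD^2 estimate between two feature sets with an RBF kernel.

    x: (n, d) sample features; y: (m, d) draws from the prior distribution.
    Returns 0 when the two sets are identical and grows as they diverge.
    """
    def k(a, b):
        # pairwise squared distances via broadcasting, then RBF kernel
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()
```

Minimizing this quantity over the feature extractor pulls the sample features toward the prior, which is the outlier-suppression role MMD plays in the clustering step described above.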